162 research outputs found

    New decoding algorithms for Hidden Markov Models using distance measures on labellings

    Background: Existing hidden Markov model decoding algorithms do not focus on approximately identifying the sequence feature boundaries. Results: We give a set of algorithms to compute the conditional probability of all labellings "near" a reference labelling λ for a sequence y for a variety of definitions of "near". In addition, we give optimization algorithms to find the best labelling for a sequence in the robust sense of having all of its feature boundaries nearly correct. Natural problems in this domain are NP-hard to optimize. For membrane proteins, our algorithms find the approximate topology of such proteins with comparable success to existing programs, while being substantially more accurate in estimating the positions of transmembrane helix boundaries. Conclusion: More robust HMM decoding may allow for better analysis of sequence features, in reasonable runtimes.

    A Combination of Compositional Index and Genetic Algorithm for Predicting Transmembrane Helical Segments

    Transmembrane helix (TMH) topology prediction is becoming a focal problem in bioinformatics because the structure of TM proteins is difficult to determine using experimental methods. Therefore, methods that can computationally predict the topology of helical membrane proteins are highly desirable. In this paper we introduce TMHindex, a method for detecting TMH segments using only the amino acid sequence information. Each amino acid in a protein sequence is represented by a Compositional Index, which is deduced from a combination of the difference in amino acid occurrences in TMH and non-TMH segments in training protein sequences and the amino acid composition information. Furthermore, a genetic algorithm was employed to find the optimal threshold value for the separation of TMH segments from non-TMH segments. The method successfully predicted 376 out of the 378 TMH segments in a dataset consisting of 70 test protein sequences. The sensitivity and specificity for classifying each amino acid in every protein sequence in the dataset were 0.901 and 0.865, respectively. To assess the generality of TMHindex, we also tested the approach on another standard 73-protein 3D helix dataset. TMHindex correctly predicted 91.8% of proteins based on TM segments. The level of accuracy achieved using TMHindex in comparison to other recent approaches for predicting the topology of TM proteins is a strong argument in favor of our proposed method. Availability: The datasets and software, together with supplementary materials, are available at: http://faculty.uaeu.ac.ae/nzaki/TMHindex.htm
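
    The thresholding step and the per-residue sensitivity and specificity mentioned in this abstract can be illustrated with a minimal sketch. This is not the TMHindex implementation: the function names, the example scores and the threshold of 0.5 are all hypothetical, and the scores merely stand in for per-residue Compositional Index values.

```python
import math


def predict_tmh(scores, threshold):
    """Label a residue as TMH (1) when its score exceeds the threshold, else non-TMH (0).

    The scores are hypothetical stand-ins for per-residue index values.
    """
    return [1 if s > threshold else 0 for s in scores]


def residue_metrics(true_labels, predicted_labels):
    """Return (sensitivity, specificity) computed residue by residue."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            tn += 1

    # Sensitivity: fraction of true TMH residues that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    # Specificity: fraction of non-TMH residues correctly left unlabelled.
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Toy example with made-up values (not data from the paper).
true_labels = [0, 0, 1, 1, 1, 1, 0, 0]
scores = [0.1, 0.6, 0.7, 0.9, 0.8, 0.4, 0.2, 0.1]
predicted = predict_tmh(scores, threshold=0.5)
sens, spec = residue_metrics(true_labels, predicted)
print(sens, spec)  # 0.75 0.75
```

    In the abstract the threshold is chosen by a genetic algorithm; here it is simply fixed by hand to keep the example short.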

    Identification of Giardia lamblia DHHC Proteins and the Role of Protein S-palmitoylation in the Encystation Process

    Protein S-palmitoylation, a hydrophobic post-translational modification, is performed by protein acyltransferases that have a common DHHC Cys-rich domain (DHHC proteins), and provides a regulatory switch for protein membrane association. In this work, we analyzed the presence of DHHC proteins in the protozoan parasite Giardia lamblia and the function of the reversible S-palmitoylation of proteins during parasite differentiation into cysts. Two specific events were observed: encysting cells displayed a larger amount of palmitoylated proteins, and parasites treated with palmitoylation inhibitors produced a reduced number of mature cysts. With bioinformatics tools, we found nine DHHC proteins, potential protein acyltransferases, in the Giardia proteome. These proteins displayed a conserved structure when compared to different organisms and are distributed in different monophyletic clades. Although all Giardia DHHC proteins were found to be present in trophozoites and encysting cells, these proteins showed a different intracellular localization in trophozoites and seemed to be differently involved in the encystation process when they were overexpressed. dhhc transgenic parasites showed a different pattern of cyst wall protein expression and yielded different amounts of mature cysts when they were induced to encyst. Our findings disclosed some important issues regarding the role of DHHC proteins and palmitoylation during Giardia encystation.
    Affiliation: Merino, Maria Cecilia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra. Universidad Nacional de Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra; Argentina
    Affiliation: Zamponi, Nahuel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra. Universidad Nacional de Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra; Argentina
    Affiliation: Vranych, Cecilia Verónica. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra. Universidad Nacional de Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra; Argentina
    Affiliation: Touz, Maria Carolina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra. Universidad Nacional de Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra; Argentina
    Affiliation: Ropolo, Andrea Silvana. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra. Universidad Nacional de Córdoba. Instituto de Investigación Médica Mercedes y Martín Ferreyra; Argentin

    Mechanistic Insights into a Novel Exporter-Importer System of Mycobacterium tuberculosis Unravel Its Role in Trafficking of Iron

    Elucidation of the basic mechanistic and biochemical principles underlying siderophore-mediated iron uptake in mycobacteria is crucial for targeting this principal survival strategy vis-à-vis virulence determinants of the pathogen. Although siderophore biosynthesis is understood, the mechanism of siderophore secretion and uptake remains elusive. Here, we demonstrate an interplay among three iron-regulated Mycobacterium tuberculosis (M.tb) proteins, namely Rv1348 (IrtA), Rv1349 (IrtB) and Rv2895c, in the export and import of M.tb siderophores across the membrane and the consequent iron uptake. IrtA, interestingly, has a fused N-terminal substrate binding domain (SBD), representing an atypical subset of ABC transporters, unlike IrtB, which harbors only the permease and ATPase domains. The SBD selectively binds to non-ferrated siderophores, whereas Rv2895c exhibits relatively higher affinity towards ferrated siderophores. An interaction between the permease domain of IrtB and Rv2895c is evident from a GST pull-down assay. In vitro liposome reconstitution experiments further demonstrate that IrtA is indeed a siderophore exporter and the two-component IrtB-Rv2895c system is an importer of ferrated siderophores. Knockout of msmeg_6554, the irtA homologue in Mycobacterium smegmatis, resulted in an impaired M.tb siderophore export that is restored upon complementation with M.tb irtA. Our data suggest the interplay of three proteins, namely IrtA, IrtB and Rv2895c, in synergizing the balance of siderophores and thus iron inside the mycobacterial cell.

    Systematic search for putative new domain families in Mycoplasma gallisepticum genome

    Background: Protein domains are the fundamental units of protein structure, function and evolution. The delineation of different domains in proteins is important for classification and for understanding structure, function and evolution. The delineation of protein domains within a polypeptide chain, namely at the genome scale, can be achieved in several ways but may remain problematic in many instances. Difficulties in identifying the domain content of a given sequence arise when the query sequence has no homologues with experimentally determined structure and searching against sequence domain databases also results in insignificant matches. Identification of domains under conditions of low sequence identity and lack of structural homologues acquires crucial importance, especially at the genomic scale. Findings: We have developed a new method for the identification of domains in unassigned regions through indirect connections and scaled up its application to the analysis of 434 unassigned regions in 726 protein sequences of the Mycoplasma gallisepticum genome. We could establish 71 new domain relationships and 63 putative new domain families through intermediate sequences in the unassigned regions, which importantly represent an overall 10% increase in PfamA domain annotation over the direct assignment in this genome. Conclusions: The systematic analysis of the unassigned regions in the Mycoplasma gallisepticum genome has provided some insight into the possible new domain relationships and putative new domain families. Further investigation of these predicted new domains may prove beneficial in improving the existing domain prediction algorithms.

    Functional discrimination of membrane proteins using machine learning techniques

    Background: Discriminating membrane proteins based on their functions is an important task in genome annotation. In this work, we have analyzed the characteristic features of amino acid residues in membrane proteins that perform major functions, such as channels/pores, electrochemical potential-driven transporters and primary active transporters. Results: We observed that the residues Asp, Asn and Tyr are dominant in channels/pores, whereas the composition of the hydrophobic residues Phe, Gly, Ile, Leu and Val is high in electrochemical potential-driven transporters. The composition of all the amino acids in primary active transporters lies between those of the other two classes of proteins. We have utilized different machine learning algorithms, such as Bayes rule, logistic function, neural network, support vector machine and decision tree, for discriminating these classes of proteins. We observed that most of the algorithms discriminated them with similar accuracy. The neural network method discriminated the channels/pores, electrochemical potential-driven transporters and active transporters with a 5-fold cross-validation accuracy of 64% in a data set of 1718 membrane proteins. The application of amino acid occurrence improved the overall accuracy to 68%. In addition, we have discriminated transporters from other α-helical and β-barrel membrane proteins with an accuracy of 85% using the k-nearest neighbor method. The classification of transporters and all other proteins (globular and membrane) showed an accuracy of 82%. Conclusion: The performance of discrimination with amino acid occurrence is better than that with amino acid composition. We suggest that this method could be effectively used to discriminate transporters from all other globular and membrane proteins, and classify them into channels/pores, electrochemical and active transporters.

    Transmembrane protein topology prediction using support vector machines

    Background: Alpha-helical transmembrane (TM) proteins are involved in a wide range of important biological processes such as cell signaling, transport of membrane-impermeable molecules, cell-cell communication, cell recognition and cell adhesion. Many are also prime drug targets, and it has been estimated that more than half of all drugs currently on the market target membrane proteins. However, due to the experimental difficulties involved in obtaining high quality crystals, this class of protein is severely under-represented in structural databases. In the absence of structural data, sequence-based prediction methods allow TM protein topology to be investigated. Results: We present a support vector machine-based (SVM) TM protein topology predictor that integrates both signal peptide and re-entrant helix prediction, benchmarked with full cross-validation on a novel data set of 131 sequences with known crystal structures. The method achieves topology prediction accuracy of 89%, while signal peptides and re-entrant helices are predicted with 93% and 44% accuracy respectively. An additional SVM trained to discriminate between globular and TM proteins detected zero false positives, with a low false negative rate of 0.4%. We present the results of applying these tools to a number of complete genomes. Source code, data sets and a web server are freely available from http://bioinf.cs.ucl.ac.uk/psipred/. Conclusion: The high accuracy of TM topology prediction, which includes detection of both signal peptides and re-entrant helices, combined with the ability to effectively discriminate between TM and globular proteins, makes this method ideally suited to whole genome annotation of alpha-helical transmembrane proteins.

    Predicting protein-protein binding sites in membrane proteins

    Background: Many integral membrane proteins, like their non-membrane counterparts, form either transient or permanent multi-subunit complexes in order to carry out their biochemical function. Computational methods that provide structural details of these interactions are needed since, despite their importance, relatively few structures of membrane protein complexes are available. Results: We present a method for predicting which residues are in protein-protein binding sites within the transmembrane regions of membrane proteins. The method uses a Random Forest classifier trained on residue type distributions and evolutionary conservation for individual surface residues, followed by spatial averaging of the residue scores. The prediction accuracy achieved for membrane proteins is comparable to that for non-membrane proteins. Also, like previous results for non-membrane proteins, the accuracy is significantly higher for residues distant from the binding site boundary. Furthermore, a predictor trained on non-membrane proteins was found to yield poor accuracy on membrane proteins, as expected from the different distribution of surface residue types between the two classes of proteins. Thus, although the same procedure can be used to predict binding sites in membrane and non-membrane proteins, separate predictors trained on each class of proteins are required. Finally, the contribution of each residue property to the overall prediction accuracy is analyzed and prediction examples are discussed. Conclusion: Given a membrane protein structure and a multiple alignment of related sequences, the presented method gives a prioritized list of which surface residues participate in intramembrane protein-protein interactions. The method has potential applications in guiding the experimental verification of membrane protein interactions, structure-based drug discovery, and also in constraining the search space for computational methods, such as protein docking or threading, that predict membrane protein complex structures.

    Differential effects of human and plant N-acetylglucosaminyltransferase I (GnTI) in plants

    In plants and animals, the first step in complex type N-glycan formation on glycoproteins is catalyzed by N-acetylglucosaminyltransferase I (GnTI). We show that the cgl1-1 mutant of Arabidopsis, which lacks GnTI activity, is fully complemented by YFP-labeled plant AtGnTI, but only partially complemented by YFP-labeled human HuGnTI, and that this is due to post-transcriptional events. In contrast to AtGnTI-YFP, only low levels of HuGnTI-YFP protein were detected in transgenic plants. In protoplast co-transfection experiments all GnTI-YFP fusion proteins co-localized with a Golgi marker protein, but AtGnTI and HuGnTI showed only limited co-localization in the same plant protoplast. The partial alternative targeting of HuGnTI in plant protoplasts was alleviated by exchanging the membrane-anchor domain with that of AtGnTI, but in stably transformed cgl1-1 plants this chimeric GnTI still did not lead to full complementation of the cgl1-1 phenotype. Combined, the results indicate that the activity of HuGnTI in plants is limited by a combination of reduced protein stability, alternative protein targeting and possibly, to some extent, lower enzymatic performance of the catalytic domain in the plant biochemical environment.